Second-Order Step-Size Tuning of SGD for Non-Convex Optimization
Abstract
In view of a direct and simple improvement of vanilla SGD, this paper presents a fine-tuning of its step-sizes in the mini-batch case. To do so, one estimates curvature, based on a local quadratic model and using only noisy gradient approximations. One obtains a new stochastic first-order method (Step-Tuned SGD), enhanced by second-order information, which can be seen as a stochastic version of the classical Barzilai-Borwein method. Our theoretical results ensure almost sure convergence to the critical set, and we provide convergence rates. Experiments on deep residual network training illustrate the favorable properties of our approach. For such networks we observe, during training, both a sudden drop of the loss and an improvement of test accuracy at medium stages, yielding better results than SGD, RMSprop, or ADAM.
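To make the Barzilai-Borwein idea mentioned in the abstract concrete, here is a minimal sketch of the classical, deterministic Barzilai-Borwein step on a small quadratic. It is not the Step-Tuned SGD algorithm from the paper: it uses exact gradients, no mini-batches, and a simple clipping safeguard that is an assumption of this example. The function name `bb_gradient_descent` and all of its parameters are hypothetical.

```python
import numpy as np


def bb_gradient_descent(grad, x0, n_iters=100, init_step=1e-3,
                        step_min=1e-6, step_max=1e3):
    """Gradient descent with the classical Barzilai-Borwein (BB1) step.

    Illustrative only: exact gradients, no mini-batching. The clipping
    bounds step_min/step_max are an assumption of this sketch.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    step = init_step
    for _ in range(n_iters):
        x_new = x - step * g
        g_new = grad(x_new)
        s = x_new - x          # change in the iterate
        y = g_new - g          # change in the gradient
        curvature = s @ y      # s^T H s under a local quadratic model
        if curvature > 0:
            # BB1 step: inverse of the curvature estimate along s.
            step = float(np.clip((s @ s) / curvature, step_min, step_max))
        # If the curvature estimate is not positive, keep the previous step.
        x, g = x_new, g_new
    return x


# Toy problem: minimize 0.5 * x^T A x - b^T x, whose gradient is A x - b.
A = np.diag([1.0, 10.0])
b = np.array([1.0, -2.0])
x_star = np.linalg.solve(A, b)
x_hat = bb_gradient_descent(lambda x: A @ x - b, x0=np.zeros(2))
```

The gradient difference `y` stands in for a Hessian-vector product, so the ratio `(s @ s) / (s @ y)` approximates an inverse curvature along the last displacement using first-order information only. With noisy mini-batch gradients this ratio becomes unreliable, which is the difficulty the paper addresses.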
Related resources
Natasha 2: Faster Non-Convex Optimization Than SGD
We design a stochastic algorithm to train any smooth neural network to ε-approximate local minima, using O(ε^{-3.25}) backpropagations. The best result was essentially O(ε^{-4}) by SGD. More broadly, it finds ε-approximate local minima of any smooth nonconvex function in rate O(ε^{-3.25}), with only oracle access to stochastic gradients. ...
Oracle Complexity of Second-Order Methods for Smooth Convex Optimization
Second-order methods, which utilize gradients as well as Hessians to optimize a given function, are of major importance in mathematical optimization. In this work, we study the oracle complexity of such methods, or equivalently, the number of iterations required to optimize a function to a given accuracy. Focusing on smooth and convex functions, we derive (to the best of our knowledge) the firs...
Second order sensitivity analysis for shape optimization of continuum structures
This study focuses on the optimization of the plane structure. Sequential quadratic programming (SQP) will be utilized, which is one of the most efficient methods for solving nonlinearly constrained optimization problems. A new formulation for the second order sensitivity analysis of the two-dimensional finite element will be developed. All the second order required derivatives will be calculat...
Sparse Second Order Cone Programming Formulations for Convex Optimization Problems
Second order cone program (SOCP) formulations of convex optimization problems are studied. We show that various SOCP formulations can be obtained depending on how auxiliary variables are introduced. An efficient SOCP formulation that increases the computational efficiency is presented by investigating the relationship between the sparsity of an SOCP formulation and the sparsity of the Schur com...
A second-order pruning step for verified global optimization
We consider pruning steps used in a branch-and-bound algorithm for verified global optimization. A first-order pruning step was given by Ratz using automatic computation of a first-order slope tuple [21, 22]. In this paper, we introduce a second-order pruning step which is based on automatic computation of a second-order slope tuple. We add this second-order pruning step to the algorithm of Ratz. Fu...
Journal
Journal title: Neural Processing Letters
Year: 2022
ISSN: 1573-773X, 1370-4621
DOI: https://doi.org/10.1007/s11063-021-10705-5